- ========================================================================
- Metrowerks MPW Tools Release Notes
- ========================================================================
-
- Version: 2.2 ( __MWERKS__ == 0x2200 )
- Date: September 9, 1998
- Authors: Fred Peterson - MPW command line interface
- Ed Swartz - MPW command line interface
- Andreas Hommel - C/C++ frontends; Classic 68K backend and linker
- Berardino Baratta - CFM68K backend and linker
- Bob Campbell, Andy Nicholas - PPC backend
-
- ========================================================================
-
- The precompiled header files will have to be rebuilt.
-
-
- ========================================================================
- New Features in This Version
- ========================================================================
-
- MWC68K and MWCPPC
- * The interface for specifying optimizations has been changed. The
- new interface is based on levels numbered 0 through 4. The new switches
- are
-
- -opt l0 # set global optimization level to 0
- -opt l1 # set global optimization level to 1
- -opt l2 # set global optimization level to 2
- -opt l3 # set global optimization level to 3
- -opt l4 # set global optimization level to 4
-
- These switches were previously in use in MWCPPC, but now they
- control more optimizations. They also apply to MWC68K.
-
- You may still use the following switches, but they are deprecated.
- You should transition to using the -opt l[0-4] switches because these
- old switches are not guaranteed to work in future releases.
-
- -opt [no]global # enable global optimizations
- -opt [no]cse # enable common sub expression optimization
- -opt [no]deadcode # enable removal of unreachable code
- -opt [no]deadstore # enable removal of dead assignments
- -opt [no]lifetimes # enable computation of variable lifetimes
- -opt [no]loop # enable removal of loop invariants
- -opt [no]propagation # enable propagation of constant and copy assignments
- -opt [no]strength # enable reduction of multiplication by an index variable
- # to addition
-
- The following switches are deprecated only for MWCPPC:
-
- -opt [no]schedule601 # schedule instructions for 601
- -opt [no]schedule603 # schedule instructions for 603
- -opt [no]schedule603e # schedule instructions for 603e
- -opt [no]schedule604 # schedule instructions for 604
- -opt [no]schedule604e # schedule instructions for 604e
- -opt [no]schedule750 # schedule instructions for 750
-
- Instead, you should use -opt schedule -processor generic|601|603|603e|604|604e|750.
- The processor specification has been decoupled from the scheduling switch because
- in the future other optimizations or compiler functionality may also depend on
- the processor.
-
- The following switches are deprecated only for MWC68K:
-
- -opt [no]color # enable register coloring
- -opt [no]peep # enable peephole optimization
-
- Combining the new and old optimization switches can produce a usage warning:
-
- MWC68K source.c -opt nocolor,l4
- ### MWC68K Usage Warning:
- # '-opt l4' may override other previously specified optimization options
-
- To take effect reliably, old options should be specified after the new options:
-
- MWC68K source.c -opt l4,nocolor
-
- There are pragmas to control some optimizations if needed. The
- following pragmas work for PowerPC, 68K and x86:
-
- #pragma opt_common_subs --> control common subexpression elimination
- #pragma opt_loop_invariance --> control loop invariant removal
- #pragma opt_propagation --> control copy and constant propagation
- #pragma opt_lifetimes --> control lifetime analysis
- #pragma opt_deadcode --> control dead code elimination
- #pragma opt_dead_assignments --> remove dead assignments
-
- In addition, there are three new PowerPC-specific pragmas, which default
- to the following (more details on these below):
-
- #pragma ppc_unroll_instructions_limit 100
- #pragma ppc_unroll_factor_limit 10
- #pragma ppc_unroll_speculative off
-
- * support for member templates
-
- Limitations:
-
- conversion member function templates are not supported
-
- member templates and member template members cannot be defined
- outside of their class definition.
-
- Member template functions are not automatically inline unless
- they are explicitly declared to be inlined. For example:
-
- struct X {
- template <class T> void f(T) {};
- template <class T> inline void g(T) {};
-
- template <class U> struct N {
- void h(U) {};
- };
- };
-
- Only X::g and X::N::h are inlined; X::f is not inlined.
- This can be used to prevent code bloat until out-of-line
- declarations are supported.
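-
- As a rough illustration of the in-class definition rule above, the
- following sketch defines a member function template and a member class
- template entirely inside the class. The names Converter, twice and Box
- are hypothetical and chosen only for this example.
-
```cpp
#include <cassert>
#include <string>

// Hypothetical example type; the names are illustrative only.
struct Converter {
    // Member function template, defined in-class and explicitly inline.
    template <class T> inline T twice(T value) { return value + value; }

    // Member class template; its members are also defined in-class.
    template <class U> struct Box {
        U held;
        U get() const { return held; }
    };
};

// Usage: T is deduced from the argument, U is supplied explicitly.
inline void member_template_demo() {
    Converter c;
    assert(c.twice(21) == 42);
    assert(c.twice(std::string("ab")) == "abab");

    Converter::Box<int> box = { 7 };
    assert(box.get() == 7);
}
```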
-
- * support for class template partial specializations
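-
- A minimal sketch of what a class template partial specialization looks
- like. IsPointer is a hypothetical trait written for this example; it is
- not part of any Metrowerks library.
-
```cpp
#include <cassert>

// Primary template: by default a type is not treated as a pointer.
template <class T> struct IsPointer {
    static const bool value = false;
};

// Partial specialization: selected for any pointer type T*.
template <class T> struct IsPointer<T*> {
    static const bool value = true;
};

inline void partial_specialization_demo() {
    assert(!IsPointer<int>::value);
    assert(IsPointer<int*>::value);
    assert(IsPointer<const char*>::value);
}
```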
-
- * support for virtual function overrides with covariant return types:
-
- class foo {
- public:
- virtual const foo *func();
- };
-
- class bar : public foo , public foo2 {
- public:
- virtual bar *func(); // OK, covariant return type
- };
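-
- A self-contained variant of the same idea, using a clone() function.
- Shape and Square are hypothetical names used only for this sketch.
-
```cpp
#include <cassert>

// Hypothetical hierarchy; names are illustrative only.
class Shape {
public:
    virtual ~Shape() {}
    virtual Shape *clone() const { return new Shape(*this); }
    virtual int sides() const { return 0; }
};

class Square : public Shape {
public:
    // Covariant override: returns Square* instead of Shape*.
    virtual Square *clone() const { return new Square(*this); }
    virtual int sides() const { return 4; }
};

inline void covariant_demo() {
    Square original;
    Square *copy = original.clone();   // no cast needed
    assert(copy->sides() == 4);

    const Shape &base = original;
    Shape *viaBase = base.clone();     // still dispatches to Square::clone
    assert(viaBase->sides() == 4);

    delete copy;
    delete viaBase;
}
```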
-
- * support for partial template function ordering
-
- * support for function-try-blocks
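-
- For readers unfamiliar with the syntax, here is a small sketch of a
- function-try-block on an ordinary function and on a constructor. The
- names parse_positive and Holder are hypothetical.
-
```cpp
#include <cassert>
#include <stdexcept>

// The whole function body is the try block; the handler follows it.
int parse_positive(int raw)
try {
    if (raw <= 0)
        throw std::invalid_argument("value must be positive");
    return raw;
}
catch (const std::invalid_argument &) {
    return -1;   // the handler supplies the return value
}

// In a constructor the function-try-block also covers the member
// initializers; the exception is rethrown when the handler ends.
struct Holder {
    int value;
    static int failures;

    explicit Holder(int raw)
    try : value(checked(raw)) {
    }
    catch (const std::invalid_argument &) {
        ++failures;   // only static state is touched in the handler
    }

    static int checked(int raw) {
        if (raw <= 0)
            throw std::invalid_argument("value must be positive");
        return raw;
    }
};
int Holder::failures = 0;

inline void function_try_block_demo() {
    assert(parse_positive(5) == 5);
    assert(parse_positive(-3) == -1);
}
```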
-
- * support for deferred code generation
-
- #pragma defer_codegen on|off|reset (default: off)
-
- This option allows inlining of 'inline' and 'auto-inline'
- functions that are called before their definition:
-
- #pragma defer_codegen on
- #pragma auto_inline on
-
- extern void f();
- extern void g();
-
- main()
- {
- f(); // will be inlined
- g(); // will be inlined
- }
-
- inline void f() {}
- void g() {}
-
- The compiler will need more memory when this option is selected.
- The command line option is "-defer_codegen on|off".
-
-
- * the ANSI C compiler now allows pointer -> pointer-size integral
- conversions in global initializations if "ANSI strict" is not
- selected.
-
- char c;
- long arr = (long)&c; // accepted (not ANSI C)
-
- * support for throwing exceptions in conditional expressions:
-
- int foo(bool cond)
- {
- return cond ? 1 : throw "oops";
- }
-
- * an import / export __declspec after a class/struct keyword is now
- also supported in non-Win ABI compilers.
-
- * the preprocessor now includes some info comments about the
- current include file in preprocessor dumps. These comments can be
- disabled via a '#pragma simple_prepdump on'.
-
- * support for two new built-in functions:
-
- primary_expression:
- __builtin_align ( <type-id> )
- __builtin_type ( <type-id> )
-
- '__builtin_align' returns the byte-alignment of <type-id>.
- '__builtin_type' returns a type-specific value (integral/enum: 0,
- floating point: 1, other: 2).
-
- * the size limit for fully expanded macros has been increased to 128K
- characters (was 32K).
-
- * better code when class type exceptions are thrown
-
- * improved template function inlining
-
- * updated linkage of 'inline' functions to latest specs
-
- * the 'implicit int' rule is no longer supported in C++
-
- f(int a) // error: implicit 'int'
- {
- return a+1;
- }
-
- * #pragma suppress_init_code on|off|reset (default: off)
-
- This #pragma can be used to suppress any static data initialization
- code generation (eg constructor calls). This should not be used
- unless you really know what you are doing.
-
- * #pragma reverse_bitfields on|off|reset (default: off)
-
- This #pragma reverses the bitfield allocation.
-
- * The WinABI alignment #pragma (#pragma pack) and alignment modes
- are now also supported by the MacOS PPC and 68K compilers.
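-
- A sketch of the effect of packing on structure layout. These notes do
- not spell out the accepted pragma forms, so the "#pragma pack(1)" and
- "#pragma pack()" spellings below are an assumption based on common
- usage; NaturalRecord and PackedRecord are hypothetical names.
-
```cpp
#include <cassert>
#include <cstddef>

// Natural alignment: the compiler may insert padding after 'tag'.
struct NaturalRecord {
    char tag;
    int  value;
};

#pragma pack(1)            // assumed syntax: 1-byte member alignment
struct PackedRecord {
    char tag;
    int  value;
};
#pragma pack()             // assumed syntax: restore default alignment

inline void pack_demo() {
    // With 1-byte packing there is no padding between the members.
    assert(sizeof(PackedRecord) == sizeof(char) + sizeof(int));
    assert(offsetof(PackedRecord, value) == 1);
    assert(sizeof(NaturalRecord) >= sizeof(PackedRecord));
}
```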
-
- * obsolete #pragmas
-
- static_inlines
- direct_destruction
-
- * New/changed error messages:
-
- "inconsistent use of 'class' and 'struct' keywords"
-
- #pragma warn_structclass on | off | reset
-
- #pragma warn_structclass on
- class X;
- struct X { int a; }; // warning: inconsistent use...
-
- ------
-
- "variable '%u' is not used in function",
-
- has been changed to:
-
- "variable / argument '%u' is not used in function",
-
- ------
-
- /* [245] */
- "illegal partial specialization",
- /* [246] */
- "illegal partial specialization argument list",
- /* [247] */
- "ambiguous use of partial specialization",
- /* [248] */
- "local classes shall not have member templates",
- /* [249] */
- "illegal template argument dependent expression",
- /* [250] */
- "implicit 'int' is no longer supported in C++",
-
-
- MWC68K
- * #pragma huge_switch on | off | reset (default off)
-
- this #pragma can be used to avoid "label out of range error" in
- functions with huge switch statements (>32K).
-
- * improved code generation
-
-
- MWCPPC
- * Implement branchless compares which don't use the condition code
- fields. These optimizations are explained in "The PowerPC(tm)
- Compiler Writer's Guide" Appendix D (D.1 Comparisons and
- Comparisons Against Zero)
-
- a = b == c;
-
- * Implement "!" without branches if the expression is not a "&&" or
- "||". In effect "!" for a value in a register is the same as r == 0.
- The code generated is:
-
- cntlzw Ry,Rvalue
- srwi Rresult,Ry,5
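-
- Why this works: counting leading zeros of a 32-bit word gives 32 only
- for an input of zero, and 32 is the only possible count with bit 5 set.
- The sketch below models that reasoning in portable C++; the function
- names are hypothetical, and count_leading_zeros32 is a stand-in for the
- instruction, not how the compiler implements it.
-
```cpp
#include <cassert>
#include <stdint.h>

// Portable stand-in for a count-leading-zeros instruction:
// returns the number of leading zero bits, 32 for an input of zero.
inline uint32_t count_leading_zeros32(uint32_t value) {
    uint32_t count = 0;
    for (uint32_t mask = 0x80000000u; mask != 0 && (value & mask) == 0; mask >>= 1)
        ++count;
    return count;
}

// "!value" without a compare: counts 0..31 shift down to 0,
// and the count 32 (binary 100000) shifts down to 1.
inline uint32_t logical_not(uint32_t value) {
    return count_leading_zeros32(value) >> 5;
}

inline void logical_not_demo() {
    assert(logical_not(0u) == 1u);
    assert(logical_not(1u) == 0u);
    assert(logical_not(0x80000000u) == 0u);
}
```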
-
- * Check for the "!!" case (which sometimes happens with inlining)
- !(!(x)) is the same as (x) iff x is a logical expression.
-
- * Optimize 16 or 8 bit math to remove the EXTS[HB] or RLWINM instructions
- when it can be proven that they are not needed (LHZ does not need to be
- followed by a RLWINM, and LHA does not need to be followed by a EXTSH
- etc.)
-
- * Extensive Optimization Improvements
- + Loop unrolling for fixed count loops has been extended to
- handle loops where the body of the loop contains conditional
- code.
-
- + Controls for loop unrolling have been made externally settable
- using the pragmas "ppc_unroll_instructions_limit" and
- "ppc_unroll_factor_limit". The default values for these are:
-
- #pragma ppc_unroll_instructions_limit 100
- #pragma ppc_unroll_factor_limit 10
-
- The factor limit controls the max number of copies of the
- loop body which will be generated. The instruction limit
- controls the total size of the unrolled loop body. (It should
- be noted that even if the limit is 100 the result will normally
- be smaller as other optimizations generally will reduce the size
- of the loop body after it has been unrolled).
-
- + Speculative unrolling: a detected counting loop whose number of
- iterations is not a compile time constant, but can be calculated
- at runtime, is speculatively unrolled. For this to work the loop
- counter must be a 32 bit value (int, long, unsigned int, unsigned
- long) and the body of the loop must not contain any conditional
- code.
-
- This feature can be disabled by using the pragma
- "ppc_unroll_speculative"
-
- #pragma ppc_unroll_speculative off
-
- The unroll factor for speculative unrolling will be a power of
- 2 (so if unroll factor limit is 10, it will try 8, 4 and 2).
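-
- The transformation can be pictured with a hand-written equivalent. This
- is only a source-level sketch of unrolling by a power-of-2 factor (4
- here) with a run-time trip count; it does not claim to match the code
- the compiler emits, and the function names are hypothetical.
-
```cpp
#include <cassert>

// Straightforward loop: the trip count 'n' is only known at run time.
inline long sum_simple(const int *data, unsigned int n) {
    long total = 0;
    for (unsigned int i = 0; i < n; ++i)
        total += data[i];
    return total;
}

// Hand-written sketch of the same loop unrolled by a factor of 4.
inline long sum_unrolled4(const int *data, unsigned int n) {
    long total = 0;
    unsigned int i = 0;
    unsigned int blocks = n / 4;         // computed at run time

    for (unsigned int b = 0; b < blocks; ++b, i += 4) {
        total += data[i];
        total += data[i + 1];
        total += data[i + 2];
        total += data[i + 3];
    }
    for (; i < n; ++i)                   // leftover 0..3 iterations
        total += data[i];
    return total;
}

inline void unroll_demo() {
    int data[6] = { 1, 2, 3, 4, 5, 6 };
    assert(sum_simple(data, 6) == 21);
    assert(sum_unrolled4(data, 6) == 21);
}
```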
-
- + Loops containing a single conditional which is loop invariant
- are unswitched:
-
- for (i = init; i < limit; i++) {
- ... pre if statements ...
- if (loop-invariant-expression) {
- ... true statements ...
- } else {
- ... false statements ...
- }
- ... post if statements ...
- }
-
- becomes:
-
- if (loop-invariant-expression) {
- for (i = init; i < limit; i++) {
- ... pre if statements ...
- ... true statements ...
- ... post if statements ...
- }
- } else {
- for (i = init; i < limit; i++) {
- ... pre if statements ...
- ... false statements ...
- ... post if statements ...
- }
- }
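-
- A concrete instance of the schema above, written out by hand. The
- function names and the choice of loop body are hypothetical; the point
- is only that both forms compute the same result.
-
```cpp
#include <cassert>

// Original form: the invariant flag 'clear' is tested on every iteration.
inline void update_all(int *data, int count, bool clear) {
    for (int i = 0; i < count; ++i) {
        if (clear)
            data[i] = 0;
        else
            data[i] *= 2;
    }
}

// Unswitched form: the test is hoisted out and each loop body
// no longer contains the conditional.
inline void update_all_unswitched(int *data, int count, bool clear) {
    if (clear) {
        for (int i = 0; i < count; ++i)
            data[i] = 0;
    } else {
        for (int i = 0; i < count; ++i)
            data[i] *= 2;
    }
}

inline void unswitch_demo() {
    int a[3] = { 1, 2, 3 };
    int b[3] = { 1, 2, 3 };
    update_all(a, 3, false);
    update_all_unswitched(b, 3, false);
    for (int i = 0; i < 3; ++i)
        assert(a[i] == b[i]);
}
```
-
- The cost of unswitching is a second copy of the loop, so it trades code
- size for fewer tests per iteration.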
-
- + Treat as counting loops those loops where the condition is BNE
- provided the increment is 1 or -1.
-
- + Improve handling of induction variables, detect nested induction
- variables which can be merged
-
- int a[10][10];
-
- for (i = 0; i < 10; i++) {
- for (j = 0; j < 10; j++) {
- a[i][j]
- }
- }
-
- detect that a[i][j] is a single induction variable (instead of
- two (a[i], and a[i][j])). This only works if the loop is over
- all elements of the first subscript (otherwise the value of
- a[i] needs to be recomputed at the beginning of the inner loop).
-
- Also detect induction variables which are simple offsets from
- each other (like fields of a structure). If possible update
- the load/store instructions to use a single induction variable
- with the offsets encoded into the load/store instruction.
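-
- At source level, merging the two induction variables is roughly like
- replacing index arithmetic with one stepping pointer. In this sketch the
- 10 x 10 array is modeled as flat storage to keep the pointer walk well
- defined; ROWS, COLS and the function names are hypothetical.
-
```cpp
#include <cassert>

const int ROWS = 10;
const int COLS = 10;

// Two induction variables: the row offset (i * COLS) and the column j.
inline long sum_two_indices(const int *cells) {
    long total = 0;
    for (int i = 0; i < ROWS; ++i)
        for (int j = 0; j < COLS; ++j)
            total += cells[i * COLS + j];
    return total;
}

// One merged induction variable: a pointer that steps through every cell.
// This is only valid because the loops cover whole rows.
inline long sum_one_pointer(const int *cells) {
    long total = 0;
    const int *p = cells;
    for (int i = 0; i < ROWS; ++i)
        for (int j = 0; j < COLS; ++j)
            total += *p++;
    return total;
}

inline void induction_demo() {
    int cells[ROWS * COLS];
    for (int k = 0; k < ROWS * COLS; ++k)
        cells[k] = k;
    assert(sum_two_indices(cells) == 4950);
    assert(sum_one_pointer(cells) == 4950);
}
```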
-
- + Extended Constant Propagation to handle ADD => ADDI, OR => ORI,
- SUB => ADDI or SUBFIC (depending on which parameter is constant).
- (These patterns were handled in the peephole, but Constant
- Propagation works across blocks).
-
- + ADDI ... Load/Store: whenever possible the constant part of an
- ADDI is propagated into the Load/Store instruction, and the ADDI
- is deferred as late in the basic block as possible. For example:
-
- int *p;
-
- *p++ = 0; *p++ = 0; *p++ = 0; *p++ = 0;
-
- Produces something like:
-
- LI R0,0
- STW R0,0(Rp)
- ADDI Rp,Rp,4
- STW R0,0(Rp)
- ADDI Rp,Rp,4
- STW R0,0(Rp)
- ADDI Rp,Rp,4
- STW R0,0(Rp)
- ADDI Rp,Rp,4
-
- This is now optimized to:
-
- LI R0,0
- STW R0,0(Rp)
- STW R0,4(Rp)
- STW R0,8(Rp)
- STW R0,12(Rp)
- ADDI Rp,Rp,16
-
- When possible, the last add is deleted if the value is not live.
- If the value is live and the ADDI can be folded into the last load
- or store, the update form of the load/store is used. This general
- optimization also works on other forms (so *--p, *++p etc. are
- handled).
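-
- The same folding can be expressed at source level: the per-store
- pointer bumps become constant offsets, and a single adjustment remains
- at the end. This is a sketch with hypothetical function names, not a
- description of the backend's implementation.
-
```cpp
#include <cassert>

// Source form: four stores, each followed by a pointer bump.
inline int *clear4_stepwise(int *p) {
    *p++ = 0; *p++ = 0; *p++ = 0; *p++ = 0;
    return p;
}

// Folded form: the offsets move into the stores; one bump at the end.
inline int *clear4_folded(int *p) {
    p[0] = 0;
    p[1] = 0;
    p[2] = 0;
    p[3] = 0;
    return p + 4;   // four ints, i.e. 16 bytes when int is 4 bytes
}

inline void fold_demo() {
    int a[4] = { 9, 9, 9, 9 };
    int b[4] = { 9, 9, 9, 9 };
    assert(clear4_stepwise(a) == a + 4);
    assert(clear4_folded(b) == b + 4);
    for (int i = 0; i < 4; ++i)
        assert(a[i] == 0 && b[i] == 0);
}
```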
-
- + Peephole optimizer now keeps track of the usage of condition code
- fields so that compares when the result would be constant can be
- removed. In the following example the BEQ can become a "B" or be
- deleted, in addition the CMP (and perhaps the LI) can be deleted
- if their results are not live.
-
- LI Rx,k; CMP[L]I CRx,Rx,l; B[EQ|NE] CRx,label
- ==>
- if branch condition is true:
- B label
- if branch condition is false:
- B nextblock
-
- *Note if the branch is to the next block the final assembly pass
- will optimize the branch away
-
- + Added peephole patterns:
- LBZX Ry,(Rx,Rw); RLWINM Rz,Ry,0,a,31; => LBZX Rz,(Rx,Rw)
- iff a <= 24, and Ry is not live after the RLWINM
-
- LHZX Ry,(Rx,Rw); RLWINM Rz,Ry,0,a,31; => LHZX Rz,(Rx,Rw)
- iff a <= 16, and Ry is not live after the RLWINM
-
- NOT Ry,Rx; AND Rz,Rw,Ry => ANDC Rz,Rw,Rx
- iff Ry is not live after the AND.
-
- + Extensive improvements for ADDI ... Load/Store patterns
-
-
- ========================================================================
- Bugs Fixed in This Version
- ========================================================================
-
- MWC68K, MWLink68K, and MWDump68K
- * the -palmos option has been reenabled. It was accidentally disabled
- in version 2.1.
-
-
- MWC68K and MWCPPC
- * the compiler no longer accepts return types in conversion functions
-
- * the compiler no longer accepts non-member conversion functions
-
- * fixes a bug with binding temporaries to non-const references during
- overload matching
-
- * the C++ compiler no longer allows implicit conversions from constant
- enumeration expressions to a pointer type.
-
- * unnamed namespace members no longer require a prototype when the
- "require function prototype" option is selected.
-
- * the compiler no longer generates 'implicit arithmetic conversion'
- warnings when the left side of an assignment is a bitfield.
-
- * functions that are defined inside of an Objective C class
- implementation now have access to private and protected members.
-
- * fixes a qualified non-static member compile-time access bug in
- certain class trees with nested qualified base classes.
-
- * fixes a bug with reinterpret reference casts
-
- * fixes a preprocessor bug with 'defined' in macros that are expanded
- inside of #if and #elif expressions
-
- * fixes a dynamic initialization bug where static template class members
- were initialized more than once (MW05797).
-
- * fixes a template class instance bug in SYM files (MW07582).
-
- * fixes some bugs with throwing and catching:
-
- const/volatile pointer/reference types
- ambiguous, local, and/or non-public base classes
- 'void' pointers.
- non-trivial virtual base classes (BR6787)
-
- * multiple using declarations are now accepted
-
- namespace N {
- typedef int T;
- typedef int T; // OK
- }
- using N::T;
- using N::T; // also OK
-
- * fixes a bug with class-type conditional declarations (MW03054).
-
- * non-const or non-lvalue class objects can no longer be assigned
- using a non-const assignment operator (eg X& operator=(X&))
-
- * fixes a bug with packed 1-byte ANSI C unions
-
-
- MWC68K
- * MW07038 - Internal compiler error : File 'IroLoop.c' Line 3354.
- Previous workaround was to turn off Reduction in Strength
- optimization.
-
- * MW07444 - Bug in the IR optimizer.
-
- * fixes an "illegal use of alloca() in function argument" error
- message bug.
-
- * fixes a struct return bug with member functions where the class
- is a trivial temporary object.
-
-
- MWCPPC
- * MW09350: fixes a bug when unrolling loops which contain a
- conditional expression which has a "continue" (or no else and code
- after the conditional part of the if):
-
- for (i = 0; i < 4; i++) {
- if (cond || cond) {
-
- }
- }
-
- * MW09249: constant propagation incorrectly converted
-
- LI Rx,0; ... SUBF Rz,Ry,Rx ===>>> MR Rz,Rx
-
- it should have been
-
- NEG Rz,Ry
-
- * MW09181: for a counting loop with post decrement (or increment), when
- the loop was converted (because it is known that the loop will always
- be executed once) the instructions following the compare were not
- copied into the "preheader".
-
- * MW09120 PowerPlant project won't link with PPC compiler 2.2 build 27
- The optimizer detected an invariant conditional in a loop and
- unswitched the loop, however the loop contained a call (which meant
- that there is little if any improvement with unswitching) and the loop
- was converted in such a way as to confuse the scheduler. The code now
- checks for calls (and instructions with side effects).
-
- * MW09048: Re : BUG - C++ PPC 2.2b1 + optimizations
- Fixed a bug in speculative loop unrolling when the loop counter
- was not 1 and the loop counter is not referenced in the loop body.
-
- * Fixed a bug handling the pattern
- ADD rX,rY,rZ; ADDI rW,rX,0 ==> ADD rW,rY,rZ
-
- * MW08931 Out-of-line traceback tables causes ICE
-
- * MW08861 fixes a bug which prevented disabling putting small static
- data in the TOC.
-
- * MW08762, MW08727 - PPC backend's parser for pragmas was incorrectly
- complaining about missing EOL after unknown pragmas
-
- * MW08761 - Incorrect internal error when trying to optimize small
- local arrays to registers.
-
- * MW08611 - fixes a bug in structure copy using doubles (when one of the
- operands being copied is an array reference).
-
- * MW08481 - fixes a bug in copy propagation which was confusing the offsets
- of parameters (only affects parameters not passed in registers)
-
- * fixes a bug in global register allocation at optimization level 2
- (with only the global optimizer "Speed" checkbox set). This bug
- only affects C++ functions with an inlined function which has to
- destroy an object in response to an exception (and even then only
- for some rare cases).
-
- * Handle a conditional expression where one of the values is a throw:
-
- var = expr ? value : throw error;
-
- * Fix the parameter types on __memcpy() and __strcpy()
-
- * Handle using !variable as a parameter so it only generates
- two instructions (instead of 4).
-
- * MW07222 - incorrect handling of "++"
-
- * MW07294 - internal error in CExpr.c for a template expansion
-
- * MW05522 - internal compiler error in IroLoop.c Line 3354
-
- * MW07014 - float op= float
-
- * MW07015 - float = int * float
-
- * MW07164 - Internal Error Operands.c line 648
-
- * MW07345 - Try/catch wrongly optimized away
-
- * MW06832 - internal compiler error: File: 'IroLoop.c' Line: 3354
-
- * MW06770 - internal compiler error: File: 'Operands.c' Line: 648
-
- * MW07242 - In some cases, PPC Global Optimizer eliminates variables that
- are still needed
-
- * MW07014,MW07015 - loop code was detecting a float induction variable
- and not handling it correctly.
-
- * MW07345 - Common Subexpression Elimination has been fixed to not lose
- the catch block.
-
- * MW05562 - compiler error for volatile classes
-
- * MW06809 - >>= of 16-bit value shifts in bits from upper word
-
- * MW07037 - Bug report, part II
-
- * MW07080 - won't compile
-
- * MW07268 - incorrect handling of shift on byte values
-
- * MW07284 - Load Byte Reversed instructions are not treated as volatile
-
- * MW07315 - invalid disassembly
-
- * MW07395 - With Full Optimizations Compiler emits extra round to single
- instructions
-
- * MW05562 - When fixing volatile allowances were made for supporting the
- expression:
-
- *(volatile int *) x;
-
- which might be used to toggle an IO location and where the read should
- not be optimized away. Since the value was never assigned to a variable,
- the PowerPC compiler would never actually load the value. In 2.1 (Pro 3)
- it forced the value to be loaded, but it did not check that the value
- actually was loadable. The way it did the load did not work for structs
- or classes. The backend now only loads basic "C" types (int, char, float,
- double) for this type of expression; it will not load a class (or struct
- or union...).
-
- * MW06809, MW07268 - The 2.1 (Pro 3) PowerPC code generator changed the way
- that it handled "op=" expressions. This change, while generally better,
- required some changes in the intermediate code to ensure that values were
- correct after the assignment.
-
- * MW07037 - Not reproducible with iPro 4 (2.1.1).
-
- * MW07080 - Not reproducible with iPro 4 (2.1.1).
-
- * MW07284 - As of iPro 4 (2.1.1) __lwbrx(), __lhbrx(), __sthbrx(), __stwbrx()
- are treated as having side effects and will never be optimized away.
-
- * MW07315 - The 68K version of the PowerPC disassembler was built incorrectly,
- and was not working correctly.
-
- * MW07395 - When doing copy propagation, the compiler was adding an unneeded type
- conversion (this conversion is needed for some CPU architectures but not PowerPC):
-
- float bug(float *fp, float s, float o)
- {
- float f, r;
-
- f = *fp;
- r = f * s + o;
- return r;
- }
-
- The compiler optimized this to:
-
- float bug(float *fp, float s, float o)
- {
- return (float) ((float)*fp) * s + o;
- }
-
- The extra float casting is needed for some non PowerPC CPUs (because of
- the way that float registers are handled). In the PowerPC case they were
- not needed. The code generator now ignores unneeded requests to convert
- from a float to a float.
-
- * MW07612 - built-in assembler bug fix
-
-
- ========================================================================
- Known Bugs and Incompatibilities
- ========================================================================
-
- * None.
-
-
- ========================================================================
- Contacting Metrowerks
- ========================================================================
-
- For bug reports, technical questions, and suggestions, please use the
- forms in the Release Notes folder on the CD, and send them to
-
- support@metrowerks.com
-
- See the CodeWarrior on the Nets document in the Release Notes folder for
- more contact information, including a list of Internet newsgroups,
- online services, and patch and update sites.
-
- ========================================================================
-
- Metrowerks Corporation
-